Entropy-based Scheduling Policy for Cross Aggregate Ranking Workloads

Authors

  • Chengcheng Dai
  • Sarana Nutanong
  • Chi-Yin Chow
  • Reynold Cheng
Abstract

Many data exploration applications require the ability to identify the top-k results according to a scoring function. We study a class of top-k ranking problems in which the top-k candidates in one dataset are scored with the assistance of another set. We call this class of workloads cross aggregate ranking. Example computation problems include evaluating the Hausdorff distance between two datasets, finding the medoid or radius within one dataset, and finding the closest or farthest pair between two datasets. In this paper, we propose a parallel and distributed solution to process cross aggregate ranking workloads. Our solution subdivides the aggregate score computation of each candidate into tasks while constantly maintaining the tentative top-k results as an uncertain top-k result set. The crux of our proposed approach lies in our entropy-based scheduling technique, which selects result-yielding tasks based on their ability to reduce the uncertainty of the tentative result set. Experimental results show that our proposed approach consistently outperforms the best existing one on two different types of cross aggregate ranking workloads using real datasets.
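
As an illustration of the workload class described in the abstract, the following is a minimal sketch, not the authors' implementation. It scores each candidate in one set by its minimum distance to another set, so that the top-1 score is the directed Hausdorff distance, and it splits each candidate's score computation into per-chunk tasks. The function name `top_k_cross_aggregate`, the `chunk_size` parameter, and the scheduling rule are assumptions made for this example: tasks are chosen greedily by the highest current upper bound, which is a simple stand-in and not the entropy-based policy proposed in the paper.

```python
import heapq
import math


def top_k_cross_aggregate(candidates, others, k=1, chunk_size=2):
    """Return (results, tasks_run) for a max-of-min cross aggregate ranking.

    Each candidate's aggregate score is its minimum Euclidean distance to
    any point in `others`. With k=1 the top score is the directed Hausdorff
    distance from `candidates` to `others`.
    """
    # Split the other set into chunks; one (candidate, chunk) pair is a task.
    chunks = [others[i:i + chunk_size] for i in range(0, len(others), chunk_size)]

    # Max-heap keyed on each candidate's current upper bound (negated).
    # Entry layout: (-upper_bound, candidate_index, next_chunk_index).
    heap = [(-math.inf, i, 0) for i in range(len(candidates))]
    heapq.heapify(heap)

    results = []
    tasks_run = 0
    while heap and len(results) < k:
        neg_ub, i, c = heapq.heappop(heap)
        if c == len(chunks):
            # Score is exact and no remaining candidate has a higher bound,
            # so this candidate is certainly in the top-k.
            results.append((-neg_ub, candidates[i]))
            continue
        # Run one task: tighten the upper bound using the next chunk.
        chunk_min = min(math.dist(candidates[i], b) for b in chunks[c])
        tasks_run += 1
        heapq.heappush(heap, (min(-neg_ub, chunk_min), i, c + 1)
                       if False else (-min(-neg_ub, chunk_min), i, c + 1))
    return results, tasks_run
```

Because a candidate's upper bound can only decrease as more chunks are processed, candidates whose bound falls below the tentative top-k never need their remaining tasks run. Replacing the greedy choice with a rule that prefers tasks expected to reduce the uncertainty of the tentative result set the most would move this sketch closer to the idea described in the abstract.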


Similar resources

Improving Search-based Parallel Job Scheduler

To balance performance goals and allow administrators to declaratively specify high-level objectives, we have proposed a goal-oriented scheduling framework by designing an objective model and a scheduling policy based on combinatorial search techniques to achieve the objective. In this work, we further evaluate our new policy on various real workloads including (1) ten monthly workloads that ran...


Ecological Efficiency Based Ranking of Cities: A Combined DEA Cross-Efficiency and Shannon’s Entropy Method

In this paper, a method is proposed to compute a comprehensive index that measures the ecological efficiency of a city by combining the measurements provided by several Data Envelopment Analysis (DEA) cross-efficiency models using Shannon’s entropy index. The DEA models include non-discretionary uncontrollable inputs, desirable and undesirable outputs. The method is implemented to...


Scheduling Multiple Data Visualization Query Workloads on a Shared Memory Machine

Query scheduling plays an important role when systems are faced with limited resources and high workloads. It becomes even more relevant for servers applying multiple query optimization techniques to batches of queries, in which portions of datasets as well as intermediate results are maintained in memory to speed up query evaluation. In this work, we present a dynamic query scheduling model ba...


Optimum Aggregate Inventory for Scheduling Multi-product Single Machine System with Zero Setup Time

In this paper we adopt the common cycle approach to economic lot scheduling problem and minimize the maximum aggregate inventory. We allow the occurrence of the idle times between any two consecutive products and consider limited capital for investment in inventory. We assume the setup times are negligible. To achieve the optimal investment in inventory we first find the idle times which minimi...


Cycle Time Optimization of Processes Using an Entropy-Based Learning for Task Allocation

Cycle time optimization could be one of the great challenges in business process management. Although there is much research on this subject, task similarities have been paid little attention. In this paper, a new approach is proposed to optimize cycle time by minimizing entropy of work lists in resource allocation while keeping workloads balanced. The idea of the entropy of work lists comes fr...



Publication date: 2016